ABSTRACT

This paper deals with public-key steganography in the
presence of a passive warden. The aim is to hide secret
messages within cover-documents without making the warden
suspicious, and without any prior sharing of a secret key.
Although a practical attempt has already been made to
solve this problem, it suffers from poor flexibility (since
the embedding and decoding steps depend heavily on
cover-signal statistics) and from low capacity compared to
recent data hiding techniques. Using the same framework,
this paper explores the use of trellis-coded quantization
techniques (TCQ and turbo TCQ) to design a more efficient
public-key scheme. Experiments on audio signals show large
improvements with respect to Cachin's security criterion.

1. INTRODUCTION 

Steganography is the art of hiding secret information within
innocuous documents. It is often illustrated by Simmons'
prisoners' problem [1]. Alice and Bob are in prison and
want to finalize a common escape plan, but their
communications are filtered by a warden named Wendy. If she
considers a transmitted document suspicious, she shuts down
the communication channel. The prisoners must therefore
exchange secret information using innocuous contents, which
is the very definition of steganography. Unlike watermarking,
steganography does not require robustness. Its goal is
transparency: the statistics of stego-signals must look as
natural as those of original cover-signals, and the visual
(or auditory) quality must not suffer from any defect.

Whereas many solutions for symmetric steganography (i.e. a
private key is shared by Alice and Bob) have been proposed,
very few practical proposals exist for the asymmetric
version (a public key for embedding and a secret key for
reading). The problem has been analyzed theoretically but,
as far as we know, only one practical attempt has been made
[2]. However, that work suffers from important limitations.
In particular, its transparency depends on extensive
knowledge of cover-signal statistics.

The aim of this article is to design a general-purpose
public-key scheme. Section 2 recalls the problem model,
i.e. communication through a channel with side information.
Section 3 presents a practical solution to the public-key
steganography problem. Section 4 concludes this work.

2. PROBLEM MODEL: CHANNEL WITH SI 

Alice wants to transmit a message m. To this end, it is
first encoded to $w \in \mathbb{R}^n$. Let us consider an
i.i.d. cover-signal denoted $x \in \mathbb{R}^n$, modeled by
the random variable X. Whereas steganographic schemes are
not supposed to be robust (i.e. resilient to attacks like
lossy compression, filtering, etc.), our initial assumption
on host signal statistics implies the use of a decorrelating
transform, resulting in an additional quantization noise
denoted z, which is modeled by $Z \sim \mathcal{N}(0, N)$.
The resulting stego-signal is denoted $y' = w + x + z$. In a
classical encoding scheme, the capacity of such a channel is
very low due to the invisibility constraint. Nevertheless,
since x is perfectly known at the encoder, steganography
represents a channel with side information available at the
encoder. As demonstrated by [3],
its capacity is



$$C = \frac{1}{2} \log_2\left(1 + \frac{P}{N}\right)$$

where P is the embedding power constraint
$\frac{1}{n}\sum_{i=1}^{n} w[i]^2 = P$. Therefore, side
information x does not have any influence on the capacity.
The use of informed encoding may thus lead to significant
capacity or invisibility improvements for steganography.

Practical, but sub-optimal, schemes for side-informed
encoding have since been proposed by the watermarking
community. A well-known one is the scalar Costa scheme (SCS)
[5]. Its principle is to use scalar quantization to define
an informed codebook. For simplicity, let us consider binary
transmission: $m \in \{0, 1\}^n$. For a given quantization
step $\Delta$, SCS defines U, a product of scalar codebooks
$U = U[1] \times U[2] \times \dots \times U[n]$ with

where $d \in [-\Delta/2, +\Delta/2]^n$ is dither noise used
as a private key. Each possible message m is associated with
a sub-codebook $U_m \subset U$, defined by
To encode m, the chosen codeword is defined as

$$u^* = \arg\min_{u \in U_m} \|u - x\| \tag{4}$$

and the added signal is $w = \alpha (u^* - x)$.
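The encoding rule above can be illustrated with a minimal
sketch. It assumes the usual binary SCS sub-codebooks
$U_m[i] = \{k\Delta + m[i]\Delta/2 + d[i],\ k \in \mathbb{Z}\}$;
this form, as well as the function names `scs_embed` and
`scs_decode` and the parameter values, are assumptions made for
illustration and are not taken from this paper.

```python
import numpy as np

def scs_embed(x, bits, delta, dither, alpha):
    """Binary SCS embedding (sketch under the assumptions stated above)."""
    # Per-sample offset of the sub-codebook selected by each message bit.
    shift = np.asarray(bits) * delta / 2 + dither
    # Closest codeword u* of the selected sub-codebook.
    u_star = delta * np.round((x - shift) / delta) + shift
    # Added signal w = alpha * (u* - x); the stego-signal is x + w.
    return x + alpha * (u_star - x)

def scs_decode(y, delta, dither):
    """Quantize to the union codebook (step delta/2) and read the bit."""
    k = np.round((y - dither) / (delta / 2)).astype(int)
    return k % 2

rng = np.random.default_rng(0)
n, delta = 1000, 4.0
x = rng.normal(0.0, 100.0, n)                # cover-signal
bits = rng.integers(0, 2, n)                 # message
d = rng.uniform(-delta / 2, delta / 2, n)    # secret dither (private key)
y = scs_embed(x, bits, delta, d, alpha=0.8)
decoded = scs_decode(y, delta, d)
```

Without channel noise, decoding is exact as long as the residual
$(1 - \alpha)(u^* - x)$ stays below $\Delta/4$, which holds here
since $\alpha = 0.8$.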

Experiments show that SCS performs poorly for uncoded
messages: for an embedding rate of 1 bit per cover-element, 
P/N must be greater than 14 dB to get a bit error rate lower 
than 10~^ (9.2 dB away from theoretical capacity). It must 
be combined with an efficient channel code, but this reduces
the embedding rate.
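The 9.2 dB figure can be checked with a few lines of arithmetic,
assuming the capacity of Section 2 is Costa's
$C = \frac{1}{2}\log_2(1 + P/N)$. The function names below are
illustrative only.

```python
import math

def costa_capacity(pn_db):
    """Capacity in bits per cover element for a given P/N in dB."""
    return 0.5 * math.log2(1 + 10 ** (pn_db / 10))

def min_pn_db(rate_bits):
    """Smallest P/N (in dB) at which the capacity reaches rate_bits."""
    return 10 * math.log10(2 ** (2 * rate_bits) - 1)

limit_db = min_pn_db(1.0)    # P/N = 3, i.e. about 4.77 dB
gap_db = 14.0 - limit_db     # distance of uncoded SCS from capacity
print(f"limit: {limit_db:.2f} dB, gap: {gap_db:.2f} dB")
```

At 1 bit per cover element the theoretical limit is P/N = 3, so
operating at 14 dB is a little over 9.2 dB away from capacity.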

3. PUBLIC-KEY STEGANOGRAPHY 

Symmetric steganography suffers from an important drawback:
Alice and Bob must share a secret key before any secret
transmission. Since their communications are supervised by
Wendy, this secret must be transmitted before their
imprisonment. Public-key steganography makes it possible to
avoid this transmission. Each user owns a pair of
cryptographic keys $(k_{pub}, k_{prv})$. The public one can
be freely communicated without any impact on secrecy, and
$k_{prv}$ must be kept secret. Anyone can send a secret
message to Bob, but only Bob is able to read it.

3.1. Previous work 

Building on ideas from [6], Guillon et al. proposed a
practical framework for public-key steganography, based on
asymmetric cryptography and a steganographic scheme using
SCS. It consists of two steps (see Fig. 1):

Initialization phase A secret key $k_{tmp}$ is randomly
chosen. It is encrypted using an asymmetric crypto-algorithm
with public key $k_{pub}$. The random-like binary vector
$k' = \mathrm{crypt}(k_{tmp}, k_{pub})$ is embedded into the
cover-signal.

Permanent phase The secret message m is transmitted using
SCS. Embedding is done by using a secret dither
noise d (see Eqn. which is generated with the 
seed $k_{tmp}$.
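How both ends can derive the same dither from the temporary key
can be sketched as follows. This is only an illustration: the
function name `dither_from_seed`, the 128-bit key length and the
use of NumPy's generator are assumptions, and a real system would
need a cryptographically secure pseudo-random generator, which
NumPy's is not.

```python
import secrets
import numpy as np

def dither_from_seed(k_tmp, n, delta):
    """Expand the temporary key into n dither values in [-delta/2, +delta/2)."""
    # Illustration only: NumPy's generator is NOT cryptographically secure.
    rng = np.random.default_rng(k_tmp)
    return rng.uniform(-delta / 2, delta / 2, n)

# Alice draws a fresh temporary key (hypothetical 128-bit length).
k_tmp = secrets.randbits(128)

# After the initialization phase Bob has recovered k_tmp with his
# private key, so both ends regenerate the same dither sequence.
d_alice = dither_from_seed(k_tmp, 16, delta=2.0)
d_bob = dither_from_seed(k_tmp, 16, delta=2.0)
```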

Whereas the second step does not raise any major issue
(SCS with secret dithering presents good security
properties [2]), the initialization phase needs to embed
public information without any noticeable change in
stego-signal statistics and quality. For that purpose,
Guillon et al. considered the use of SCS embedding with
$\alpha = 1/2$. But in the case of a non-uniform
cover-signal pdf, this leaves easily noticeable artifacts on
stego-signal statistics (e.g. for the Gaussian case, see
Fig. 2(a)). This problem does not appear in the permanent
phase thanks to the secret dither noise d. Their solution is
to use a compressor before embedding to equalize the pdf,
then embed k' and apply the inverse compressor to recover
the original pdf shape. This implies that the compressor
design must be robust to data embedding and quantization
noise.

Fig. 1. Public-key framework from [2]. The permanent phase
is initialized by a random temporary key k.

3.2. Trellis-coded quantization for initialization phase 

Since SCS partitioning is regular, it introduces artifacts
on marked signals. The approach that we propose is to use
trellis-coded quantization to design a pseudo-random space
partitioning. TCQ techniques for side-informed coding
combine robustness, SCS-like capacity and ease of
implementation. Let us consider a trellis defined by a
transition function:

$$t : S \times \{0, 1\} \to S, \quad (s_i, m[i]) \mapsto s_{i+1} \tag{5}$$



where S = {0, 1, . . . ,2''"^} is the set of possible trellis 
states. Unlike SCS, dither noise d is no longer random but
a function of the current state and input symbol (a private
key may also be introduced in the function for additional
security):



$$o : S \times \{0, 1\} \to [-\Delta/2, +\Delta/2], \quad (s_i, m[i]) \mapsto d[i]$$

Sub-codebooks are then defined by

$$U_j[i] = \{k\Delta + o(s_j, m[i]),\ k \in \mathbb{Z}\}$$

and the closest codeword $u^* \in U_m$ to x is computed
using a Viterbi algorithm with strong a priori metrics to
ensure that the decoded codeword belongs to $U_m$.

The added signal is given by $w = \alpha (u^* - x)$.
Experiments show that the best $\alpha$ parameter in terms
of robustness is $P / (P + N)$, as in the original Costa
scheme. As demonstrated by Fig. 2(b), no statistical
artifact is noticeable using this technique.
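A toy version of TCQ-based embedding and decoding can be sketched
as below. Several choices are assumptions made for illustration
and are not specified in this paper: an 8-state shift-register
trellis, per-state offsets drawn from a keyed generator, the two
branches leaving a state being $\Delta/2$ apart, and all function
names. Because the message fixes the path through the trellis,
the embedder simply walks that path, while the decoder runs a
Viterbi search over all paths.

```python
import numpy as np

DELTA = 1.0
N_STATES = 8  # illustrative trellis size

def next_state(s, b):
    """Transition function t(s, b): shift-register trellis (assumption)."""
    return ((s << 1) | b) % N_STATES

def make_offsets(key):
    """Output function o(s, b): keyed pseudo-random offsets in [-DELTA/2, DELTA/2)."""
    base = np.random.default_rng(key).uniform(-DELTA / 2, 0.0, N_STATES)
    return np.stack([base, base + DELTA / 2], axis=1)  # indexed as [s, b]

def quantize(v, offset):
    """Closest point of the shifted lattice {k*DELTA + offset}."""
    return DELTA * np.round((v - offset) / DELTA) + offset

def embed(x, bits, offsets, alpha):
    """Follow the path imposed by the message and move x towards u*."""
    s, y = 0, np.empty_like(x)
    for i, b in enumerate(bits):
        u = quantize(x[i], offsets[s, b])
        y[i] = x[i] + alpha * (u - x[i])
        s = next_state(s, b)
    return y

def decode(y, offsets):
    """Viterbi search for the closest codeword; returns the path's bits."""
    n = len(y)
    cost = np.full(N_STATES, np.inf)
    cost[0] = 0.0
    back = np.zeros((n, N_STATES, 2), dtype=int)  # (previous state, bit)
    for i in range(n):
        new = np.full(N_STATES, np.inf)
        for s in range(N_STATES):
            if not np.isfinite(cost[s]):
                continue
            for b in (0, 1):
                c = cost[s] + (y[i] - quantize(y[i], offsets[s, b])) ** 2
                t = next_state(s, b)
                if c < new[t]:
                    new[t] = c
                    back[i, t] = (s, b)
        cost = new
    s, bits = int(np.argmin(cost)), []
    for i in range(n - 1, -1, -1):
        s, b = back[i, s]
        bits.append(int(b))
    return bits[::-1]
```

With $\alpha = 1$ and no channel noise, the true path has zero
cost while any other path pays at least $(\Delta/2)^2$ where it
diverges, so the message is recovered exactly.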

On the other hand, the use of linear embedding (i.e.
$w = \alpha (u^* - x)$) leaves another clue for Wendy. Let
$d_x$ be the distance between an original cover-signal x and
its closest codeword $u \in U$. We have
$E[d_x] = \Delta^2/48$ (since two quantizers are available
for each input sample, thanks to trellis coding). Let $d_y$
be the distance between a stego-content y and its closest
codeword. To avoid suspicion, we must ensure
$d_y \approx d_x$, i.e. use $\alpha = 1/2$. But this
parameter does not make it possible to reach the Voronoi
region associated with $u^*$ in most practical cases. A
larger $\alpha$ parameter must be chosen, leading to
$d_y < d_x$.
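The value $\Delta^2/48$ can be recovered as follows, assuming
that the two quantizers available for each sample are offset by
$\Delta/2$, so that their union acts as a uniform quantizer of
step $\Delta/2$, and reading $d_x$ as a squared distance (both
are assumptions, the text does not state them explicitly). The
quantization error $e$ is then uniform on
$[-\Delta/4, +\Delta/4]$:

```latex
E[e^2] = \int_{-\Delta/4}^{+\Delta/4} e^2 \, \frac{2}{\Delta} \, de
       = \frac{2}{\Delta} \cdot \frac{2}{3} \left(\frac{\Delta}{4}\right)^3
       = \frac{(\Delta/2)^2}{12}
       = \frac{\Delta^2}{48}
```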

The chosen embedding technique is similar to Miller's
iterative solution and is illustrated by Fig. 3. The idea
is to bring the cover-signal x into the chosen area, at a
distance $\sqrt{N} + \epsilon$ from its frontier¹. It
iterates as follows:

1. Let y = x and let $u^* \in U_m$ be the targeted codeword.

2. Find the closest codeword $u \in U$ to y. If $u = u^*$,
stop.

3. Let $d = \frac{u^* - u}{|u^* - u|}$ and
$\beta = \frac{(\dots)}{2} + \sqrt{N} + \epsilon$.

4. Add $\beta d$ to y and go to step 2.

(a) SCS embedding with $\alpha = 0.5$

Obviously, this technique is not very efficient in terms of
capacity. Nevertheless, the message transmitted during this
first phase is small, since it is only a pseudo-random
generator seed.
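The iterative embedding of the initialization phase can be
sketched on a toy codebook as below. The exact expression of
$\beta$ is not legible in the source, so this sketch assumes
that $\beta$ is the distance from y to the boundary between u
and $u^*$ (their bisector, measured along d) plus the margin
$\sqrt{N} + \epsilon$. The explicit random codebook and the
function names are also assumptions; the paper uses trellis
codebooks searched with a Viterbi algorithm.

```python
import numpy as np

def nearest(codebook, y):
    """Index of the codeword closest to y (exhaustive search)."""
    return int(np.argmin(np.linalg.norm(codebook - y, axis=1)))

def iterative_embed(x, codebook, target, noise_std, eps, max_iter=1000):
    """Push x into the decoding region of codebook[target]."""
    y = x.astype(float)          # step 1: y = x (astype makes a copy)
    u_star = codebook[target]
    for _ in range(max_iter):
        j = nearest(codebook, y)  # step 2: closest codeword u
        if j == target:
            return y
        u = codebook[j]
        # Step 3: unit direction from u towards u*.
        d = (u_star - u) / np.linalg.norm(u_star - u)
        # ASSUMPTION: beta = distance to the bisector of (u, u*) along d,
        # plus the security margin sqrt(N) + eps.
        to_boundary = float(np.dot((u + u_star) / 2 - y, d))
        beta = to_boundary + noise_std + eps
        y = y + beta * d          # step 4
    raise RuntimeError("no convergence within max_iter iterations")

rng = np.random.default_rng(0)
codebook = rng.normal(0.0, 10.0, (16, 8))   # 16 toy codewords in R^8
x = rng.normal(0.0, 10.0, 8)                # cover-signal
y = iterative_embed(x, codebook, target=3, noise_std=0.1, eps=0.01)
```

Each iteration strictly reduces the distance between y and
$u^*$ as long as the margin is smaller than the spacing between
codewords, so the loop terminates on a finite codebook.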

3.3. Powerful dirty paper codes for permanent phase 

Turbo TCQ [8] is a recent source coding technique inspired
by iterative channel decoding algorithms. The turbo
trellis-coded quantizer is composed of two parallel TCQ
trellises. The first one works with the signal y' to be
decoded, while the second one decodes an interleaved version
of y'. A posteriori metrics from the first quantizer are
used as a priori metrics for the second. The process is
repeated until both a posteriori metrics are similar.

We use this technique to design a powerful dirty paper
code. For an embedding rate of 1 bit per cover element, the
use of turbo TCQ leads to a gain of 5.5 dB compared to
classical SCS (see Fig. 4). Since robustness is not a
priority in steganography, this gain translates into better
transparency.

¹ Since the realization of Z is unknown during embedding, we
introduce $\epsilon$ to have an additional security margin.



(b) Embedding using TCQ codes with $\alpha = 0.7$

Fig. 2. Resulting probability density functions after embed- 
ding (X - 7V(0, 10^) and P = 10*). A 29-state trellis is 
used for TCQ. 



3.4. Results: security from Cachin's point of view 

Cachin's paper [9] is a pioneering attempt to define a
security criterion for steganography. Considering an i.i.d.
cover-signal (as in this work), it defines security as the
relative entropy between cover and stego-signal. A
steganographic scheme is said to be $\epsilon$-secure
against a passive warden if
$D_{KL}(P_Y \| P_X) \le \epsilon$, where

$$D_{KL}(P_Y \| P_X) = \sum_{c \in C} P_Y(c) \log \frac{P_Y(c)}{P_X(c)} \tag{8}$$
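In practice, the relative entropy has to be estimated from
samples. A minimal histogram-based sketch is given below; the
number of bins, the small floor added to empty bins and the use
of base-2 logarithms (result in bits) are assumptions, not
settings taken from the experiments reported here.

```python
import numpy as np

def kl_divergence_bits(stego, cover, bins=64, floor=1e-12):
    """Histogram estimate of D_KL(P_Y || P_X), in bits."""
    lo = min(stego.min(), cover.min())
    hi = max(stego.max(), cover.max())
    edges = np.linspace(lo, hi, bins + 1)   # shared bins for both signals
    p_y = np.histogram(stego, bins=edges)[0].astype(float) + floor
    p_x = np.histogram(cover, bins=edges)[0].astype(float) + floor
    p_y /= p_y.sum()
    p_x /= p_x.sum()
    return float(np.sum(p_y * np.log2(p_y / p_x)))

rng = np.random.default_rng(0)
cover = rng.laplace(0.0, 1.0, 20000)        # Laplacian toy cover
identical = kl_divergence_bits(cover, cover)
shifted = kl_divergence_bits(cover + 0.5, cover)
```

Identical signals give a divergence of zero, while any
distortion of the pdf, such as the shift used here, yields a
strictly positive value.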



In order to evaluate the $\epsilon$-security of the proposed
permanent phase, we used our technique for audio
steganography. Two test samples were used: one is a smooth
bass solo ("Jazz") and the other is a powerful guitar piece
("Heavy metal"). Both are 5-second PCM samples (44.1 kHz,
16 bits). Embedding is performed as in Guillon's practical
proposal: an MDCT is applied on analysis windows of 512
samples, and coefficients are grouped into 32 sub-bands over
10 windows (i.e. each sub-band contains 160 coefficients).
Each sub-band is assumed to be Laplacian distributed. A
random binary message is embedded using SCS and turbo TCQ
codes.



Fig. 5. Performance of the permanent phase in terms of
$\epsilon$-security for SCS and the proposed technique.

Fig. 3. Data embedding for the initialization phase using a
Monte Carlo technique.

Fig. 4. Bit error rates for trellis-coded quantization codes
compared to SCS. Embedding rate is 1 bit per cover element.



The quantization step $\Delta$ is chosen to get a bit error rate lower
than 10^^, and embedding rate is modified using spread 
transform (like ST-SCS [5]). Fig. 5 shows the resulting
$\epsilon$-security. We can see that the security gain is
similar for both test samples. From Cachin's point of view,
our turbo TCQ codes are about ten times more secure than
classical SCS for high embedding rates.

4. CONCLUSION 

This paper provided a practical and efficient solution to
the public-key steganography problem. Starting from an
existing framework based on asymmetric cryptography, we
improved its generality and its efficiency. Unlike [2], the
proposed initialization phase (transmission of the secret
key) is independent of cover-signal statistics thanks to a
pseudo-random space partitioning. The permanent phase
(transmission of the secret message) is more secure with
respect to Cachin's criterion.